The problem of the plant breeder is the 
production of new varieties that are more de- 
sirable than those grown previously on the 
farms in his state or region. His objective is 
to combine into a single variety, or hybrid, as 
many desirable characters as possible, includ- 
ing yield. Only very large differences in yield 
can be judged visually, so that replicated field 
trials are necessary for comparing quantita- 
tively the yields of various strains. 


When the number of varieties, or strains, 
to be tested is small, the method of randomized 
complete blocks (8, 9, 11, 17) is the most 
common experimental design. This design is 
applicable for any number of replications. 
Missing plots may be interpolated easily. If 
several plots of a variety are missing, the var- 
iety may be omitted from the analysis and the 
remaining varieties compared without bias. 


When the number of varieties is large, the 
greater soil variation between the plots within 
the larger blocks increases the error of the 
experiment. To overcome this difficulty the 
lattice designs have been devised (7, 21, 
22, 23). 


In addition to the control of field variation 
by the lattice design, standard varieties are 
often planted as checks between the blocks of 
the experiment. Frequent reference to these 
standards is highly desirable in observing dif- 
ferences in characters other than yield. New 
varieties which are inferior to the standards in 
any important character are discarded. 


In the simple and triple lattices (7, 12)
each complete replication of k² varieties is
divided into smaller blocks of k varieties each.
Variations in yield due to the differences
between the smaller blocks in the productivity
of the soil are removed from the error which
otherwise would be used. The blocks of k
plots are made the unit for error control
instead of the replicates of k² plots. The number
of replications must be a multiple of two for
simple lattices and a multiple of three for
triple lattices.


Lattice square designs (2, 4, 23) have
some of the features of Latin squares (9) in
that the k² varieties are planted in the form of
a k x k square in each replicate. Variations
in soil fertility can then be eliminated both
between rows and between columns in each
square or replicate. The lattice squares most
commonly used are those with ½(k+1)
replications for 25, 49, 81 and 121 varieties, or
with (k+1) replications for 9, 16, 25, 49 and
64 varieties.


Cubic lattice designs (21) will have blocks
of k plots for k³ varieties and require three
replications or multiples of this number. With
these designs as many as 1,000 varieties can 
be compared without requiring large individual 
blocks. 

Balanced lattice designs (18, 22) of the
most useful type are those for testing 9, 16,
25, 49, 64, 81 or 121 varieties in blocks
containing the square root of the number of
varieties with k+1 replications of the
varieties. These designs are said to be
balanced since each pair of treatments occurs
together once in an incomplete block. All
varieties are compared with equal precision.
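
The balance property described above can be checked directly for small cases. The following is only a sketch under a stated assumption: it builds the k+1 replicates by arithmetic modulo k, which works when k is prime (9, 25, 49 or 121 varieties); the 16, 64 and 81 variety designs need a different construction that is not attempted here. The function names are illustrative and do not come from the cited references.

```python
from itertools import combinations

def balanced_lattice(k):
    """Return k+1 replicates, each a list of k blocks of k variety numbers.

    Assumption of this sketch: k is prime, so that lines taken modulo k
    give k+1 mutually orthogonal groupings of the k*k varieties.
    """
    # Varieties 0 .. k*k-1 are laid out on a k x k grid: variety = row*k + col.
    # Replicate 0 uses the rows of the grid as blocks.
    replicates = [[[r * k + c for c in range(k)] for r in range(k)]]
    # Replicates 1..k: block b for multiplier m holds cells (r, (m*r + b) mod k).
    for m in range(k):
        replicates.append(
            [[r * k + (m * r + b) % k for r in range(k)] for b in range(k)]
        )
    return replicates

def pair_counts(replicates):
    """Count how often each pair of varieties shares an incomplete block."""
    counts = {}
    for rep in replicates:
        for block in rep:
            for pair in combinations(sorted(block), 2):
                counts[pair] = counts.get(pair, 0) + 1
    return counts

# Nine varieties in blocks of three with four replications.
design = balanced_lattice(3)
counts = pair_counts(design)
print(len(design), "replicates;", len(counts), "pairs; counts:", set(counts.values()))
```

For nine varieties every one of the 36 possible pairs should appear together exactly once, which is the sense in which the design is balanced.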

All of the above lattice designs may be
analyzed legitimately as randomized complete
blocks if desired (7, 22). If the soil varies
less within the incomplete blocks than between
them, the lattice designs will give a smaller
experimental error than randomized complete
blocks. The increase in efficiency through the
use of the lattice designs will depend on the
number of varieties to be tested and the nature
of the soil variation. In 20 corn yield trials
in Iowa, Cochran (3) found that triple lattices
with three replications were somewhat more
accurate than randomized blocks with five
replications. With lattice squares the increase
in accuracy over randomized blocks represented
a saving of about one replication in six with
25 varieties, one in five with 49 or 81
varieties and one in three with 121 varieties.
Wellhausen (18) found that the average
efficiency of 60 balanced lattice designs with
nine varieties of corn in four replications was
146 percent compared with randomized
blocks. These were planted on soil selected
for uniformity. Twenty-six lattice squares
with 25 and 49 varieties had an efficiency of
140 percent relative to randomized blocks.

In the analysis of experiments in incomplete
blocks, if plots are missing it is simpler to
disregard the differences between blocks within
complete replications. The missing values are
then computed by the simpler methods for
randomized blocks, or some varieties are
omitted. Many plant breeders prefer to use
several randomized block designs, each with not
over 20 varieties, rather than the larger
lattices which include all varieties. One or more
standard varieties are included in each
randomized block experiment.

Another design for comparing a larger
number of varieties divides them into groups,
the same grouping being used in all replicates,
and the varieties are randomized within groups.
One or more standard varieties are included in
each group, even though they may not be used
in computing the experimental errors. Such
designs lead to two errors, one applicable for
comparisons of varieties in the same group and
another for comparisons of varieties in
different groups. Frequently these two errors are
markedly different. Such designs are likely to
prove inferior in efficiency to the lattice
designs.

For varietal trials combined with such
tests as different fertilizers, dates of seeding
and methods of disease control, split plot
designs are very useful (12, 19, 20). For
example, in some South American countries the
planting of corn or wheat may extend over a
long period of time. Varietal comparisons
are often combined with tests on date of
seeding. The most practicable field
arrangement is to lay out large plots at random
within each replicate for the different dates
of seeding and randomize the varieties in
subplots within each large plot. Split plot
designs lead to two or more separate errors,
each applicable to the test of significance for
certain treatments. These errors usually differ
in precision, and practicability frequently
dictates which are to be the main plots with the
larger error. In other cases, the experimenter
can determine through the design of the
experiment the treatments on which he is
willing to sacrifice precision in exchange for
a lower error in some other comparisons of
more vital interest.


In Minnesota a variety is not recommended
to the farmers until it has been tested for
three years in several parts of the state, and
has been proved superior to existing varieties
in one or more important characters and not
deficient in any essential respect. Yield is
considered only one of the important characters.
The constancy of yield may be determined 
from the interaction of varieties by years and 
the uniformity of performance in different 
places in the state may be tested through the 
interaction of varieties by places as compared 
with the residual error (12, 14, 24). Varieties 
are discarded as ruthlessly for susceptibility to 
important diseases, deficiencies in an important 
agronomic character or for a serious lack of 
qualities for commercial use as for low yield. 


The incidence of some diseases can best
be expressed as a percentage of infected plants.
Since the error of a percentage, 100√(pq/n)
with q = 1 - p, depends on the expected
proportion p, difficulties may arise in an
analysis of variance.
Especially if the range in the percentage is
great, it may be necessary to transform the
percentages into a form in which the errors
are independent of the unit of measure. This
is accomplished by transforming the
percentages to an angle φ, where the percent
= 100 sin²φ, as suggested by Bliss (1, 10).
Such transformed data may be used in
ordinary analyses of variance.
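
As a minimal sketch of the transformation just described, the angle can be obtained by inverting percent = 100 sin²φ. The function names below are illustrative only, and the angle is expressed in degrees, which is a choice made for this example.

```python
import math

def angular_transform(percent):
    """Return the angle phi, in degrees, such that percent = 100 * sin(phi)**2."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must lie between 0 and 100")
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))

def binomial_se_percent(percent, n):
    """Standard error of a percentage, 100 * sqrt(p*q/n), with q = 1 - p."""
    p = percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)

# The untransformed error changes with p; the angles are what enter the analysis.
for pct in (5, 25, 50, 90):
    print(pct, round(binomial_se_percent(pct, 100), 2), round(angular_transform(pct), 2))
```

The printout shows the standard error of the raw percentage shrinking toward the ends of the scale, which is the dependence on p that the angular scale is meant to remove.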


Correlation or regression methods are use- 
ful in studying the relations between charac- 
ters in varieties or strains at plant breeding 
nurseries. These methods may be combined 
with the analysis of variance to study by 
covariance (8, 9, 11, 17) the effect of diseases 
or of shriveling of the seed and other
characters upon yield. Many character
interrelationships may be brought to light and
this knowledge used in the selection of
desirable character combinations.

Some characters are best expressed in
categories. In studying their possible
associations the χ² test for independence has
proved a very useful statistical tool (8, 17).
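
A short sketch of the χ² test for independence follows. The counts in the example are hypothetical and serve only to show the arithmetic; no continuity correction is applied, and every row and column total is assumed to be greater than zero.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows of observed counts. Expected counts are
    row total * column total / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, dof

# Hypothetical counts: rows = two classes of one character,
# columns = two classes of another.
chi2, dof = chi_square_independence([[30, 10], [20, 40]])
print(round(chi2, 3), dof)
```

The statistic is then referred to a table of χ² with the stated degrees of freedom.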

Frequently, the mode of inheritance of 
new characters can be studied during the early 
segregating generations of a hybridization pro- 
gram. Genetic information can then be ob- 

(Continued on page 28) 
NOTES ON ANALYSIS OF EXPERIMENTS REPLICATED IN TIME


By H. G. Wilm
Forest Service, U. S. Dept. of Agriculture


In experiments designed to test the effects
of various treatments upon some factor such
as crop yield, the investigator may find it
desirable to repeat his studies for several years
on the same study areas. The purpose of such
repetition is to find out whether changes in
climate, or in other factors associated with
time, exert a real influence upon the general
effect of the treatments that are under
scrutiny; whether the magnitude of treatment
effects fails to remain the same in different
years.

Since the interaction of treatment effects
with such time factors frequently has a
substantial bearing on the general applications
of experimental results, the repetition of
experiments over several seasons or other periods
of time has become a common practice. In
field application this is an easy process;
but the analysis of results is not so simple,
and some debate exists among investigators as
to the most desirable method. Although
articles on different aspects of this topic have
appeared in British journals (1, 2, 8), the
general principles of this kind of analysis
have not been generally available to American
investigators. Hence it seems desirable that
some research worker who has labored
independently on this problem submit his
interpretation of these principles and an
outline of the appropriate analytic method.

As a typical example, we may consider
a randomized block design containing four
blocks of five plots with five treatments
assigned at random within each block; the
treatments have been repeated without
rerandomization on the same plots over a
series of three years. As usual, it will be
convenient to assume for the sake of our
analysis that the four blocks and these three
years represent a random sample drawn from
a population of blocks that are generally
similar to ours, and from a longer series of
years to which our results might be applied.
This assumption requires that the results
of this experiment be scrutinized with some
conservatism.

The data from an actual experiment based
on this design are presented in Table 1. In
this particular case, the magnitudes of the
variable studied (soil moisture deficits under
a forest) were not perceptibly affected by any
cumulative influence of time as expressed by
observations taken in successive years; nor
was such an influence to be expected during
the period of study. This characteristic seems
to be desirable for the main purpose of our
present discussion, as it simplifies the
explanation of principles and the analysis of the
data. In many experiments, however,
treatments such as fertilizers applied for a series
of years on the same areas may exert a
systematic influence on crop yields or other
factors being tested. In these cases the yields may
exhibit variation which is correlated with time
as well as random variation. The presence of
such systematic variation does not alter the
general principles of our analysis, but it
does require an extra analytic step.

Since the soil moisture experiment was
repeated with the same assignment of
treatments to the plots each year, its structure is
analogous to that of a split plot design in
which the three “splits,” or years, within each
plot exhibit variations associated only with
time. Hence the form of analysis is simply
that of any split plot experiment (7, pp. 72-75),
as displayed in the left half of Table 2.

In this type of analysis the total sum of
squares is divided into two main portions:

1. The sum of squares “between
plots,” with 19 degrees of freedom in
our example. This portion is further
subdivided into the typical
analysis of a randomized block
experiment (4), in which the several
sums of squares are calculated from
the three-year sums obtained for each
of the 20 plots.

2. The sum of squares “within
plots,” which is computed from the
data for individual years within each
plot and is subdivided to provide all
the variances associated with years.


As in any split plot experiment, each of
the two major parts of this analysis is
characterized by its own “experimental error”
mean square, which may be employed to test
the real nature, or significance, of the
comparisons contained within that part of the
whole analysis. In the first part, for example,
the treatment mean square may be compared
with that calculated for the interaction of
treatments with blocks (place error, Table 2),
which provides an estimate of the failure
of the treatments to behave alike from place to
place in the population sampled. And in the
second part, the mean square for “years” and
the interactions associated with years may be
compared with a second error term, provided
by the triple interaction.


Table 1. Soil Moisture Deficits as Affected by Timber Cutting
(All data in inches of water)

Treatment*                      Block                    Treatment
  Year               A        B        C        D          sums

Uncut (11,900)
  1941             2.40     0.98     1.38     1.37         6.13
  1942             3.32     1.91     2.36     1.62         9.21
  1943             2.59     1.44     1.66     1.75         7.44
  Sum              8.31     4.33     5.40     4.74        22.78
6,000
  1941             1.76     1.65     1.69     1.11         6.21
  1942             2.78     2.07     2.98     2.50        10.33
  1943             2.27     2.28     2.16     2.06         8.77
  Sum              6.81     6.00     6.83     5.67        25.31
(label illegible)
  1941             1.43     1.30     0.18     1.66         4.57
  1942             2.51     1.48     1.83     2.36         8.18
  1943             1.54     1.46     0.16     1.84         5.00
  Sum              5.48     4.24     2.17     5.86        17.75
2,000
  1941             1.24     0.70     0.69     0.82         3.45
  1942             3.29     2.00     1.38     1.98         8.65
  1943             2.67     1.44     1.75     1.56         7.42
  Sum              7.20     4.14     3.82     4.36        19.52
None
  1941             0.79     0.21     0.01     0.16         1.17
  1942             1.70     1.44     2.65     2.15         7.94
  1943             1.62     1.26     1.36     1.87         6.11
  Sum              4.11     2.91     4.02     4.18        15.22
Block sums
  1941             7.62     4.84     3.95     5.12        21.53
  1942            13.60     8.90    11.20    10.61        44.31
  1943            10.69     7.88     7.09     9.08        34.74
  Sum             31.91    21.62    22.24    24.81       100.58

* Expressed as volume in board-feet of trees larger than 9.6 inches in diameter, which
were left in the forest after treatment.


Table 2. Analysis of variance, Soil moisture experiment

Source of variation           Degrees of   Mean      Pure variances contained
                              freedom      square    in mean square

Total                            59        0.5577
(Between plots)                 (19)
  (t)   Treatments                4        1.3333    12 s²_t + 3 s²_tb + 4 s²_ty + s²_tby
  (b)   Blocks                    3        1.4832    15 s²_b + 3 s²_tb + 5 s²_by + s²_tby
  (tb)  Treatments x Blocks:     12        0.3909    3 s²_tb + s²_tby
        “Place” error
(Within plots)                  (40)
  (y)   Years                     2        6.5418    20 s²_y + 4 s²_ty + 5 s²_by + s²_tby
  (ty)  Treatments x Years:       8        0.2554    4 s²_ty + s²_tby
        “Time” error
  (by)  Blocks x Years            6        0.1294    5 s²_by + s²_tby
  (tby) Triple interaction       24        0.1053    s²_tby
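
The partition summarized in Table 2 can be reproduced from the Table 1 data. The sketch below is an illustration, not the author's computing procedure. It assumes the Table 1 entries that are illegible in the printed copy take the values implied by the row and column sums, and it uses the hypothetical key "third" for the treatment whose label cannot be read.

```python
# Table 1 deficits in inches of water: one grid per treatment,
# rows = years 1941-1943, columns = blocks A-D.
# Assumption: illegible entries are filled from the printed marginal sums.
DATA = {
    "uncut": [[2.40, 0.98, 1.38, 1.37], [3.32, 1.91, 2.36, 1.62], [2.59, 1.44, 1.66, 1.75]],
    "6000":  [[1.76, 1.65, 1.69, 1.11], [2.78, 2.07, 2.98, 2.50], [2.27, 2.28, 2.16, 2.06]],
    "third": [[1.43, 1.30, 0.18, 1.66], [2.51, 1.48, 1.83, 2.36], [1.54, 1.46, 0.16, 1.84]],
    "2000":  [[1.24, 0.70, 0.69, 0.82], [3.29, 2.00, 1.38, 1.98], [2.67, 1.44, 1.75, 1.56]],
    "none":  [[0.79, 0.21, 0.01, 0.16], [1.70, 1.44, 2.65, 2.15], [1.62, 1.26, 1.36, 1.87]],
}

def split_plot_anova(data):
    """Return {source: (degrees of freedom, mean square)} for treatments x blocks x years."""
    grids = list(data.values())
    t, n, q = len(grids), len(grids[0]), len(grids[0][0])  # treatments, years, blocks
    obs = [(i, y, b, grids[i][y][b]) for i in range(t) for y in range(n) for b in range(q)]
    grand = sum(o[3] for o in obs)
    cf = grand ** 2 / (t * n * q)  # correction for the mean

    def ss(key, size):
        """Sum of squares among totals grouped by `key`, each total holding `size` values."""
        totals = {}
        for i, y, b, v in obs:
            k = key(i, y, b)
            totals[k] = totals.get(k, 0.0) + v
        return sum(x * x for x in totals.values()) / size - cf

    ss_t = ss(lambda i, y, b: i, n * q)
    ss_b = ss(lambda i, y, b: b, t * n)
    ss_y = ss(lambda i, y, b: y, t * q)
    ss_tb = ss(lambda i, y, b: (i, b), n) - ss_t - ss_b   # between plots, less t and b
    ss_ty = ss(lambda i, y, b: (i, y), q) - ss_t - ss_y
    ss_by = ss(lambda i, y, b: (b, y), t) - ss_b - ss_y
    ss_total = sum(o[3] ** 2 for o in obs) - cf
    ss_tby = ss_total - ss_t - ss_b - ss_y - ss_tb - ss_ty - ss_by
    parts = {
        "t": (t - 1, ss_t), "b": (q - 1, ss_b), "tb": ((t - 1) * (q - 1), ss_tb),
        "y": (n - 1, ss_y), "ty": ((t - 1) * (n - 1), ss_ty),
        "by": ((q - 1) * (n - 1), ss_by),
        "tby": ((t - 1) * (q - 1) * (n - 1), ss_tby),
    }
    return {src: (df, s / df) for src, (df, s) in parts.items()}

for source, (df, ms) in split_plot_anova(DATA).items():
    print(f"{source:>4} {df:3d} {ms:8.4f}")
```

Under these assumptions the printed mean squares should agree with Table 2 to about the fourth decimal place.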


For the purpose of our present experiment,
the most interesting feature of the “within
plots” analysis is the interaction of treatments
with years (time error, Table 2). Since
this mean square estimates the failure of the
treatments to behave alike from year to
year, it provides an error term in the time
dimension which, like the treatment-block
interaction, may be used to test the significance
of treatment effects.


We have outlined the general form of
analysis for this kind of experiment, have
partitioned it into its appropriate comparisons,
and have calculated a mean square for each.
With obvious minor changes, the same
principles may be applied to other experimental
designs. Before we proceed to test the
significance of treatment effects, however, we must
consider one further complication, which may
be clarified by inspection of the internal
structure of the mean squares obtained in our
analysis of variance. As shown by Fisher (3,
Sec. 40) and elucidated by Tippett (5) and
Winsor and Clarke (6), each mean square
contains one or more “pure” variances, each
estimating a single component of variation such
as the effect of treatments or the interaction
of treatments with blocks; and each variance
is included one or more times in the mean
square, the number of times being equal to
the number of observations contained in the
sums from which the mean square is
computed. Thus the triple interaction contains
only a single variance, taken only once
because this mean square was calculated from
single observations. And the treatment-block
interaction, in our analysis, contains the triple
interaction variance and also the pure
treatment-block variance; the latter is included n
times in the mean square, with n equal to the
number of years contained in each of the
20 sums from which this mean square was
calculated. Hence the treatment-block mean
square may be expressed as

$$V_{tb} = n s_{tb}^2 + s_{tby}^2$$

where $s^2$ is an estimate of a pure variance, and
the subscripts indicate the character of the
mean square or variance. Similarly, the
treatment-year mean square may be stated
as

$$V_{ty} = q s_{ty}^2 + s_{tby}^2$$

where q is the number of blocks contained in
each sum from which this mean square was
calculated. And, as a final example, the
treatment mean square contains a series of
variances: that due to the pure effect of the
treatments, and the pure variances which
estimate the effects of all the interactions
associated with treatments:

$$V_t = qn s_t^2 + n s_{tb}^2 + q s_{ty}^2 + s_{tby}^2$$

Expressed in terms of the particular
experiment under discussion (with five
treatments, four blocks, and three years), these
equations become

$$V_{tb} = 3 s_{tb}^2 + s_{tby}^2$$

$$V_{ty} = 4 s_{ty}^2 + s_{tby}^2$$

$$V_t = (4)(3) s_t^2 + 3 s_{tb}^2 + 4 s_{ty}^2 + s_{tby}^2$$

In these quantitative terms, the contents of
all the mean squares in our analysis are
presented in the right half of Table 2.


Assuming that each of the pure variances
in the above three equations is significantly
greater than zero, each contributes a real
amount to the several mean squares. This
fact brings us to the main point of our
discussion: either of our two error terms for
testing treatment effects ($V_{tb}$ or $V_{ty}$)
contains only two components of error, while the
treatment mean square contains four: not only
the pure variance associated with treatment
and those contained in either one of the
error terms, but also an additional interaction
variance which is not directly associated with
the treatment effects to be tested, but is a
part of the other error term. In itself this
fact does not invalidate our tests; but it does
call for one further step in analysis. When
we have proceeded this far, we can ask two
alternative questions (keeping in mind the
need for conservatism imposed by our initial
assumption that these blocks and years
actually represent a random sample of
populations in space and time):

1. In years that are similar to
those included in our experiment, are
the treatments likely to exert a real
effect if applied in other blocks in
the population of which our blocks
are a sample? and

2. On the average for blocks that
are generally similar to ours, are the
treatments likely to exert a real effect
if applied in other years contained in
the time population of which our
three years are a sample?

In order to answer the first question,
we calculate the values estimated by our
experiment for each of the several pure variances
(see Table 2); subtract the treatment-year
component ($q s_{ty}^2$) from the treatment mean
square; and test the residual mean square by
comparison with the treatment-block mean
square. The second question requires a
similar procedure, except that we subtract the
treatment-block component ($n s_{tb}^2$) and use
the treatment-year mean square as the error
term.


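For the present experiment, these
procedures may be expressed quantitatively as
follows:

1) $$V_t' = 12 s_t^2 + 3 s_{tb}^2 + s_{tby}^2 = V_t - 4 s_{ty}^2 = 1.3333 - 0.1501 = 1.1832$$

Thence $$F' = \frac{V_t'}{V_{tb}} = \frac{1.1832}{0.3909}$$

and

2) $$V_t'' = 12 s_t^2 + 4 s_{ty}^2 + s_{tby}^2 = V_t - 3 s_{tb}^2 = 1.3333 - 0.2856 = 1.0477$$

Thence $$F'' = \frac{V_t''}{V_{ty}} = \frac{1.0477}{0.2554}$$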


These calculated values for F’ and F” will not
follow the mathematical distribution of F
exactly; and the number of degrees of freedom
which should be associated with the
two comparisons has not yet been clarified.


Estimates of significance obtained from 
these two tests should not be much in error, 
however, if they are based on somewhat fewer 
degrees of freedom than those indicated in 
the analysis. 
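
The arithmetic of the two adjusted tests can be traced in a few lines. The sketch below takes the Table 2 mean squares as given and follows the subtraction described in the text; the function and variable names are illustrative only, and nothing here settles the question of the appropriate degrees of freedom.

```python
# Mean squares from Table 2.
MS = {"t": 1.3333, "tb": 0.3909, "ty": 0.2554, "tby": 0.1053}
N_YEARS, Q_BLOCKS = 3, 4

def adjusted_treatment_tests(ms, n, q):
    """Estimate the pure interaction variances and the two adjusted ratios."""
    s2_tb = (ms["tb"] - ms["tby"]) / n   # pure treatment-block variance
    s2_ty = (ms["ty"] - ms["tby"]) / q   # pure treatment-year variance
    v_first = ms["t"] - q * s2_ty        # treatment mean square less the treatment-year component
    v_second = ms["t"] - n * s2_tb       # treatment mean square less the treatment-block component
    return {
        "s2_tb": s2_tb,
        "s2_ty": s2_ty,
        "V_first": v_first,
        "V_second": v_second,
        "F_first": v_first / ms["tb"],    # question 1: tested against place error
        "F_second": v_second / ms["ty"],  # question 2: tested against time error
    }

for name, value in adjusted_treatment_tests(MS, N_YEARS, Q_BLOCKS).items():
    print(f"{name:>9} {value:7.4f}")
```

The two residual mean squares come out at 1.1832 and 1.0477, matching the worked figures, and the ratios are roughly 3.03 and 4.10.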


The steps outlined above are, of course,
unnecessary if the treatment-year (or
treatment-block) mean square proves not to be
significantly greater than the triple interaction,
so that the associated pure variance does not
contribute a significant amount to the
treatment mean square.

If desired, one more analytic step may be
taken in order to derive a maximum quantity
of information from the analysis. Wherever
the treatment-block or treatment-year
mean square indicates a real interaction of
the treatments with either place or time,
the above procedures may be usefully
supplemented by the separate analysis of each
block or year, accompanied by scrutiny of the
data and of any characteristics peculiar to
each plot and year. The purpose of such
analysis is to explain, if possible, the nature
of the interaction and the reasons for its
significance. In the experiment under
discussion, for example, the effects of timber
cutting were found to fluctuate significantly from
year to year; and these fluctuations were found
to depend largely on the amount and
distribution of the antecedent summer rainfall which
reached the soil through the forest canopy
remaining after treatment.


As indicated in our introductory
discussion, the variation between years and the
interaction of treatments with years may often
contain a systematic as well as a random
component, the former expressing a cumulative
effect of past treatments or perhaps only
a carry-over effect of the preceding year's
treatments. In such cases, as outlined by
Cochran and others (1, 2, 8), the systematic
component may be isolated by covariance
analysis. In this additional step, the sums of
squares connected with years are broken down
into components that are associated with and
independent of the regression of treatments on
some factor which expresses the systematic
time effect. Where a cumulative effect is
suspected, the logical procedure is to fit one or
more terms of the regression of treatments on
years, employing the method of “orthogonal
polynomials” perfected by Fisher (4, Sec.
14.6). And in cases where only a carry-over
effect is suspected, the experimental results
may be adjusted by a regression of the
current year’s results on those of the preceding
year.


Winsor, C. P. and G. L. Clarke. A statistical ... varia ... nets. Sears Foundation: Jour. Res.
... F. The design and analysis of ...


THE GEOGRAPHICAL DISTRIBUTION OF GENES DETERMINING
INDIVIDUAL HUMAN BLOOD DIFFERENCES


Philip Levine, M.D., F. A. C. P.
Ortho Research Foundation


The physical anthropologist employs
racial characteristics, the heredity of which is
most uncertain. The serologist has the great
advantage of studying blood properties which
are inherited according to simple mendelian
laws. This holds true especially for the blood
properties A and B (the four blood groups)
and the factors M and N. A considerable
impetus to the study of the geographical
distribution of blood properties was supplied
recently by the discovery of the Rh factor and
its importance in a specific form of selective
fetal and neonatal morbidity (erythroblastosis
fetalis).


The accepted theory of the heredity of 
the AB system was discovered by Bernstein 
(a mathematician who never carried out any 
blood tests) by the analysis of gene fre- 
quencies for several racial (geographic) groups 
characterized by varying incidence of the four 
blood groups. 


Much has been contributed to the early 
history of man by geographic studies of only 
one set of three allelomorphic genes (O, A 
and B). The two factors most important in 


explaining the varying distribution of the 
blood groups are first isolation and then mix- 
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ture. In the very early history of man, the 
isolation of very small groups with the loss 
of the rarer gene for the factor B and the occasional
loss of the more frequent gene for the
factor A resulted in populations which are puzzling
to the anthropologist, i.e., identical racial
groups with striking differences in the incidence
of the A and B factors and entirely
different races with very similar distribution of 
the blood groups. These serve as excellent 
examples of Wright’s mechanism of gene fixa- 
tion, the geographic distribution of which is 
largely or entirely accidental. 


Even the most striking discrepancies in
any race observed in the studies with the
factors A and B may be reconciled by the findings
with other blood factors, for it is improb-


tribes of American Indians, one having a high 
incidence of group O, and the other with a 
preponderance of group A, have equally high 
values for the factor M and both tribes have 
equally low incidence of taste-blindness to 
para-ethoxy-phenyl-thio-urea. 


Although systematic studies are lacking, 
there is considerable information with regard 
to the geographic distribution of the subgroups
of group A and factor P of Landsteiner
and Levine. Thus, American Indians
having a high incidence of group A are almost
exclusively A1; a random white population of
New York City has 5 or 6 times as much A1
as A2, while Negroes in the same area have
somewhat less A1 than A2. The factor P is
much more frequent in Negroes than in white 
individuals. These studies are significant even 
though these factors are not as well defined 
serologically as the factors A, B, M, N or Rh. 


Investigations on the Rh factor have already
revealed its significance in racial relationships.
Its role in the pathogenesis of erythroblastosis
fetalis is now well established.
The view has been expressed that the Rh-rh
gene is responsible for more fetal and neonatal
deaths than any other gene difference known.
Because of the morbid effects specifically on
heterozygous infants (Rhrh), the stability of the
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Rh gene in any population is of considerable 
importance especially in those races with a 
high incidence of rhrh individuals. The racial 
incidence of this disease is directly propor- 
tional to the value of rhrh individuals in any 
given population. Thus, corresponding to the 
value of Rh negative (rhrh) individuals in 
whites, Negroes and Chinese, 15%, 5 to 
8%, and 1% respectively, the disease is three 
times more frequent in whites than in Negroes 
and is almost entirely unknown among Chinese.
This selection against the heterozygous and actually
against the less frequent recessive gene,
which follows from Levine’s theory, has been
discussed recently by many noted scientists.
Most workers support the view that the genes
are of comparatively recent origin so that
they have not yet reached a state of equilibrium.
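
The relation between the incidence of Rh negative individuals and the frequency of the recessive gene can be sketched as below. The sketch assumes random mating with genotypes in the proportions p², 2pq, q² (the Hardy-Weinberg rule), which the text does not state and which selection against the heterozygote would disturb; the function names are introduced here for illustration only. The incidences are those quoted above.

```python
from math import sqrt


def recessive_gene_frequency(homozygote_incidence):
    """Frequency q of the rh gene, assuming rhrh occurs with frequency q**2."""
    return sqrt(homozygote_incidence)


def genotype_frequencies(q):
    """Expected genotype proportions under random mating (an assumption)."""
    p = 1.0 - q
    return {"RhRh": p * p, "Rhrh": 2.0 * p * q, "rhrh": q * q}


# Incidence of Rh negative (rhrh) individuals as quoted in the text.
rh_negative_incidence = {
    "whites": 0.15,
    "Negroes (lower figure)": 0.05,
    "Negroes (upper figure)": 0.08,
    "Chinese": 0.01,
}

for group, incidence in rh_negative_incidence.items():
    q = recessive_gene_frequency(incidence)
    heterozygotes = genotype_frequencies(q)["Rhrh"]
    print(f"{group}: q = {q:.3f}, expected Rhrh = {heterozygotes:.3f}")
```

On this assumption a population may carry many heterozygotes even where Rh negative individuals are rare, which is one reason the gene frequencies, and not only the incidence of rhrh, are of interest in racial comparisons.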


The current genetic theory is that the varying
types of Rh reactions (phenotypes) are determined
by a series of multiple allelomorphs.
A complex antigenic and corresponding genetic
structure is to be expected in any factor demonstrated
by isoimmunization, in contrast to
the simpler antigenic and genetic scheme of
the MN system demonstrable by heteroimmunization.
Racial studies on a vast scale similar
to those on A and B should yield much more
significant data than that yielded by geographic
studies of M and N or perhaps even
A and B. As with the four blood groups, the
correct and final theory of the heredity of the
Rh system will emerge from or be confirmed
by statistical analysis of the distribution of
the various Rh subtypes in many races. Such
studies should be most valuable in spite of
selection against the recessive type since several
races are already known to have a very
low incidence of Rh negative individuals
(American Indians, Chinese, and Japanese).


Selective fetal death induced by the factors
A and B, or by any other factors like Rh
capable of inducing isoimmunization of the
mother by the fetus, must now be considered
in genetic and geographic studies of all racial
groups.
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Since every statistical method has a great 
variety of possible applications, the traditional 
practice of teaching statistical methods as 
if they were branches of one or another 
of the applications is evidently doomed. The 
teaching arrangements prevailing in the past 
might be compared to the teaching of chem- 
istry, zoology, anatomy or bacteriology in an 
imaginary medical school as incidental ac- 
tivities of the departments concerned with 
the various kinds of disease, each department 
teaching as much of these sciences as it 
considered necessary for the treatment of its 
own disease. In such a medical school chem- 
istry might be taught in the Department of 
Cardiology by a cardiologist, and quite inde- 
pendently in the Department of Obstetrics by 
an obstetrician. Each department would see 
to it that its own instructor in chemistry really 
knew the disease to which that department was 
consecrated, but it would regard chemistry in 
general as a very minor concern. It would not
object if its man happened also to be a decent
chemist, provided he did not wander off
into chemistry so far as to give the impression
that his feet were not solidly on the ground.
Such a school would do little for the advancement
of chemistry. Its students would not
have the benefit of chemical instruction as 
accurate, complete and modern as could be 
supplied by genuine specialists in chemistry. 

The advantages of specialization and di- 
vision of labor point clearly to the future 
teaching of statistics by specialists in the 
subject, a class devoted to the increase and 
diffusion of knowledge regarding statistical 
methods and theory. Only such specialists, 
removed from pressure to devote too much of 
their time to particular applications, can hope 
to concentrate sufficiently upon the subject- 
matter of statistical theory and method to 
purify it of its errors, explore and strengthen 
its foundations, build up its superstructure, 
and transmit it as a living body of knowledge 
rather than as an old and defective tool to 
the newer generation. Such a scholarly group 
must be organized around its own subject, 
statistics. 

While the future of statistical teaching will 
thus be in the hands of specialists, there are 
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GRADUATE WORK IN STATISTICS AT COLUMBIA UNIVERSITY 


difficulties about the transition to that future. 
If all the colleges and universities in the 
country should now undertake to put into 
effect the change indicated, they would find 
it impossible for the simple reason that there 
do not exist specialists in statistical theory 
and methods in anything like sufficient numbers.
A necessary preliminary is the development
of the specialists. The problem of obtaining
competent scholars specializing in
statistics for the college and university fac- 
ulties is made more difficult by the strong 
demand from industry and from government, 
as well as from research organizations of varied 
types, for individuals having practically the 
same type of preparation and ability. 
Columbia University has undertaken a pro- 
gram of graduate work in statistics designed 
for the preparation of scholars who will work 
on the highest levels. Graduate students may 
enroll for the Ph.D. degree in statistics under 
the supervision of an interdepartmental com- 
mittee. Each student’s work is arranged to 
include pure mathematics, mathematical sta- 
tistics, and a field in which statistical methods 
can be applied. The relative time allotted 
to study under each of these heads varies from 
individual to individual according to previous 
education and experience. All must go through 
a closely integrated series of courses in mathe- 
matical statistics, beginning with the logical 
and mathematical theories of probability and 
proceeding through the various major divisions 
of statistical theory. In these courses the main 
emphasis is on exact statement and careful 
derivation of statistical principles and methods, 
with the limitations of each method made clear 
in discussing the assumptions underlying the 
derivations. Attention is given to the prac- 
tical devices and approximations that have to 
be used in the absence of suitable agreement 
between empirical situations and the assump- 
tions underlying standard methods, or on ac- 
count of the inadequacy of existing mathe- 
matics. These courses also include diversified 
practical examples, and some training is given 
in computing and in careful writing. They 
enable the student to acquire a certain amount 
of facility in practical statistical work, and also 
bring him to the threshold of research by 
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pointing out the large number of unsolved 
problems that confront the statistician. 


In preparation for research, students are 
encouraged to include as much pure mathematics
as possible in their studies. A really
good command of calculus is necessary before
beginning the curriculum in statistics, and
elementary matrix algebra and theory of
functions are studied in the first year by
those who have not had them earlier. The 
Department of Mathematics also provides 
courses in more advanced mathematics of value 
in statistics, including among others finite dif- 
ferences and elementary and advanced differen- 
tial equations, the last involving some special 
functions used in statistics. 


Work in a field of application is particu- 
larly stressed in the case of students who have 
concentrated heavily on mathematics as un- 
dergraduates. The aim is not so much to de- 
velop statistical economists, psychologists, or 
biological research workers as to provide the 
statistician with the essential equipment for 
cooperating with specialists in at least one em- 
pirical field, and for bridging the gap between 
them and mathematics. The chief aim of this 
part of the training is to make the mathe- 
matical statistician so definitely aware of the 
kind of situation facing the practical worker 
that the former will concentrate his ingenuity 
on the provision of tools of real value to the 
latter. The relation is felt definitely to be 
that of tool-maker and tool-user; the maker 
must know well the uses of his tools in order 
to design them efficiently. The actual fields 
of application chosen by students are diverse. 
One student selected life insurance and ac- 
tuarial work, another vital and public health 
statistics, others genetics, industrial quality 
control, economics, and other subjects. Study 
of these fields is sometimes in courses in 
the relevant university departments or some- 
times by individual guided study in the case 
of students who have a strong foundation in a 
particular field acquired in undergraduate 
study. In such a field as industrial quality con- 
trol, practical experience in the Bell Telephone 
Laboratories or other companies having well-organized
quality control departments may
meet the basic need. 


Most of the classes in mathematical sta- 
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tistics meet in the late afternoon or evening, 
and many of the students hold jobs, either 
full- or part-time, often of such a nature as 
to bring them into contact with practical prob- 
lems involving statistics. Still, candidates for 
the doctorate are warned that they will need 
at least one year, and probably more, of 
full-time graduate study at the university, in 
addition to study that can be carried on while 
holding a job. 


The classes in mathematical statistics in- 
clude many students who are not candidates 
for the doctorate in statistics, but are study- 
ing the subject for the sake of its applications 
to their major subjects or vocations. In normal 
times these courses are taken as electives by 
numerous undergraduates, and by students 
under the various graduate and professional 
faculties. There are groups of graduate stu- 
dents majoring in education and psychology, in 
economics and business, in philosophy, mathe- 
matics, engineering, physics, chemistry and 
biology of various kinds, with a few in 
sociology, history and scattered subjects. All 
are admitted if they have the necessary mathe- 
matical prerequisites. 


Columbia University has no master’s degree
in statistics as such. However, many
students working largely in mathematical sta- 
tistics receive M.A. degrees in economics, 
mathematics or other departments. 


The group working in mathematical sta- 
tistics at Columbia University engages in re- 
search in statistical theory and methodology. 
Its members also advise and collaborate 
with numerous members of the university 
faculty and others regarding the statistical 
aspects of their work and the design of their 
experiments. The membership of this group 
of faculty and students overlaps that of the 
Statistical Research Group, a university or- 
ganization dealing with war problems referred 
to it by the government. 


The interdepartmental committee in
charge of the program for candidates for the
degree of Doctor of Philosophy in Statistics
consists of Dean George B. Pegram and Professors
F. E. Croxton, R. S. Lynd, F. C.
Mills, J. F. Ritt, Abraham Wald, Helen M.
Walker, and Harold Hotelling (chairman). 
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NEWS AND NOTES 


The Bureau of Ships has three biometricians
from Cornell plotting its course. Lt. C.
McC. Mottley, formerly Associate Professor of
Limnology and Fisheries, is head of the Operational
Analysis Sub-section with the Quality
Control Section of the Research and Standards
Branch of the Bureau in Washington. With
him is Lt. Walter C. Jacob, formerly research
assistant in the Department of Vegetable Crops
(seafood), New York State College of Agriculture.
He spent some time as an administrative
officer at a Naval Training School in
Richmond before coming with the Bureau
where he is applying statistical methods in
the development of materials for naval use.
Also engaged in this work is Lt. Daniel R.
Embody. He was an instructor in Limnology
and Fisheries and spent a year at sea before
he joined this group.


M. Sandomire, who was statistical 
editor of the U. S. Department of Agriculture 
Publication Division, spent 20 months in The 
Inspector General's Office, where she was in 
charge of the statistical aspects of surveys of 
automotive maintenance in the army. She 
has been with the Navy Department, Bureau 
of Ships, for the past year, applying statistical 
methods to research problems. 


In June, D. B. DeLury will join the stat- 
istical staff of the Virginia Polytechnic In- 
stitute and the Virginia Agricultural Experi- 
ment Station at Blacksburg. He received his 
degrees from the University of Toronto and 
spent a year of post doctorate work with Harold 
Hotelling at Columbia University. He has 
taught at Saskatchewan, University of Toronto 
and has been statistical consultant for several 
government agencies. A generous grant from 
the General Education Board has made pos- 
sible the expansion of the statistical work 
at Blacksburg. 


Associate Director F. R. Immer has returned
to St. Paul from one of those secret
missions. During the trip he talked with R. A.
Fisher, who now has the chair of Genetics
in Cambridge University. A direct report
says that F. R. Immer returned with a couple
of very good stories for our reunion at the
next Biometrics Section meeting. With so many
section members abroad, things should be going
well over there! A. E. Brandt reports
“While I was in England fighting the cold,
the fog, and the smoke—.” And we thought


he was helping with the war! To this, add 
the fact that his chief recreation has been 
wild boar hunting. One wonders—until it 
is noted that he met Frank Yates and Fred- 
erick F. Stephan in Paris. Official approval 
was granted for the report that Frank Yates
has a war job as statistical advisor to a
general scientific advisor to an important
military man!—an important job which has
an immediate bearing on operations. It seems
to be no secret, for it was in print, that Frederick
F. Stephan is in Paris to participate
in a special bombing survey for the Air Transport
Command ... In the usual confidential
manner it can be stated that Rensis Likert is
in the European Theater of Operations on a
special research assignment for the War Department
... If anyone knows where W. J.
Youden is now, keep it a secret! A log of
his activities since those 7:00 A.M. classes at 
Iowa State during the summer of 1942 may be 
expected at some future date. D. B. DeLury 
especially wishes a report on who gets him 
up in the mornings . . . During the absence 
of Churchill Eisenhart, statistician of the 
University of Wisconsin, J. H. Torrie is acting 
as statistical advisor for the Agricultural Ex- 
periment Station. He has been giving courses 
in statistical methods and in experimental design.
Dorothy Mosher, a graduate in
Mathematics at the University of Wisconsin,
has been appointed assistant in the statistical
office there ... E. W. Lindstrom, head of
Genetics Department at Iowa State
College, is in Medellin, Colombia, South
America, lecturing in the College of Agricul- 
ture and helping to initiate a research program 
in plant genetics ... A report came last January
that R. A. Fisher was about to depart for
India for several months. Does someone want
to verify the report? John Wishart, M. G.
Kendall, Bradford and others are working
for the war but not directly with military
operations ... Alan E. Treloar of the University
of Minnesota has joined the staff
of the Statistical Research Group. You might
look for him from Maine to Florida ...
Glenn Burton, Coastal Plain Experiment Station,
visited North Carolina State College
recently. And during his stay the local re- 
search workers in genetics and plant breeding 
had a luncheon meeting with him. Since 
1941, J. Neyman has been director of the 
Statistical Laboratory at Berkeley, California. 
The Laboratory as a unit is busy on war work 
under the Applied Mathematics Panel. Four
members of the Laboratory, George B. Dantzig,
F. W. [...], Mark W. Eudey and
Erich Lehmann are with the services but
hope to return as soon as circumstances permit
... Fryer, statistician at Kansas
State College, is on leave to serve with The 
Division of War Research at Columbia Univer- 
sity. When he wrote, he was “deep in the 
heart of Texas” but his family is in Tenafly, 
New Jersey. We don’t know where he is 
registered for Selective Service so consider that 
all above addresses may or may not be correct 


... We are being told repeatedly that important
advances in statistical theory and practice
have been worked out by the various
groups. A member of one group states, “All 
of us have had plenty of new problems which 
could not help influencing our thinking and, 
in time, many novelties are likely to appear 
in the literature available to all”... How 
about your sending some news items—that 
is, if you want correct news about yourself 
reported! 


QUERIES 


QUERY What is the error to be used for testing
the significance of treatments in this experiment?
Six treatments were applied to
laying hens in 12 cages each containing 10
birds. Each treatment was given in two cages
chosen at random.


ANSWER The experimental error is the mean
square, 575.9, for cages receiving the same
treatment. The value, F = 1,074.5/575.9 =
1.87, with degrees of freedom $n_1 = 5$ and
$n_2 = 6$, is small as compared to the 5% point,
4.39. Hence, in the absence of any further information,
one concludes that these treatments
may have no effect on the production of eggs.


        Analysis of variance of numbers of eggs laid

Source of variation               Degrees of     Sum of       Mean
                                  freedom        squares      square
Treatments                              5        5,372.3     1,074.5
Cages receiving same treatment          6        3,455.6       575.9
Hens in same cage                     108       32,164.7       297.8


The analysis of variance contains some
additional information. Apparently there were
inequalities in the environments of the cages.
Evidence is found in the ratio, F = 575.9/297.8
= 1.93, df = 6 and 108, which is just
below the 5% point, 2.19. This leads one to
suspect that such things as light, humidity, air
currents, temperature and incidence of communicable
diseases may have affected egg production.
If these things differentiated production
in cages receiving the same treatment,
they undoubtedly affected production in cages
with different treatments; hence, the large
experimental error.

The testing of significance in a table like
that above is made clearer to some by subdivision
of the mean squares into parts representing
the three sources of variation assumed
present. First, there is the variation in egg
production by individuals occupying the same
cage, estimated by $s^2$ = 297.8. For the purposes
of this answer, this will be taken as common
to all cages. Next, there is the average
variation of production in pairs of cages
treated alike, estimated by $s_c^2$. The mean
square for cages receiving the same treatment
is the sum, $s^2 + 10s_c^2$, 10 being the number
of birds per cage (Fisher's “Statistical Methods,”
section 40. Snedecor’s “Statistical
Methods,” section 10.14. Winsor and Clarke,
“A Statistical Study of Variation in the Catch
of Plankton Nets,” Journal of Marine Research,
3:25-27). Finally, the mean square for treatments
may have the additional component,
$s_t^2$, with the coefficient, 20, the number of hens
receiving each treatment.

The three sample values of $s^2$ are each
estimates of variances, $\sigma^2$, in the sampled
populations. The following table summarizes the
argument:

Source of        Mean       Individuals    Mean square is
variation        square     per group      an estimate of
Treatments       1,074.5        20         $\sigma^2 + 10\sigma_c^2 + 20\sigma_t^2$
Cages              575.9        10         $\sigma^2 + 10\sigma_c^2$
Individuals        297.8         1         $\sigma^2$


Now, F tests some null hypothesis; for
example, that one of the $\sigma$'s is zero. The
significance of differences among treatment
means is determined by testing the hypothesis,
$\sigma_t = 0$, in the ratio,

$$\frac{\sigma^2 + 10\sigma_c^2 + 20\sigma_t^2}{\sigma^2 + 10\sigma_c^2}$$

It is clear that if $\sigma_t^2$ is zero, then this ratio
is 1. The corresponding experimental value
of F is

$$F = \frac{1{,}074.5}{575.9} = 1.87$$

The question is this: what is the probability of a greater
excess over 1 in random sampling from a
population in which $\sigma_t = 0$? Comparison
with the tabular value of F shows this probability
to be considerably greater than 0.05;
that is, there is little evidence against the hypothesis.

The other null hypothesis tested above is
that $\sigma_c = 0$ in the variance ratio,

$$\frac{\sigma^2 + 10\sigma_c^2}{\sigma^2}$$

for which F = 575.9/297.8 = 1.93, as before.

George W. Snedecor


QUERY Eight treatments were compared in [...]
analysis of variance was:


Source of variation     Degrees of freedom     Mean square
Treatments                       7                15.39
Error                           42                 7.89

F = 1.95          5% F = 2.24


The treatment means were 14.0, 17.3, 14.4,
13.3, 14.0, 12.7, 13.9 and 13.6 tons. The
greatest difference is between treatments 2
and 6, 4.6 tons, the standard error of this difference
being 1.404 tons. This makes t = 3.28,
a highly significant value.


According to the F-test, there is no significant
difference; yet according to the t-test,
there is a highly significant difference. What
interpretation is possible under this situation?


ANSWER It is not informative to compare
the two tests that you have made. The
F-test is exact, the null hypothesis tested being
this: The 8 treatment means are randomly
drawn from a single normal population. The
probability of a larger F from the hypothetical
population is about 0.11.


The hypothesis tested by t is that the
two means are randomly drawn from a normal
population. Since you selected the largest and
smallest means for comparison, the tabulated
probabilities of t are not applicable. Fisher
has suggested a method for using the available
tables for your purpose (Design of Experiments,
section 24). The difference you have
chosen is 1 of 28 comparisons that might be
made among the 8 treatment means. It is proposed,
therefore, that the probability to be
required of the selected difference be not 1
in 20 but 1 in (28)(20) = 560. Since the
probability of your value of t is more than
1 in 560 (it is about 1 in 460), the conclusion
based on this t-test is about the same as that
reached from the F-test.


There is another test which is often useful:
the range from the lowest mean to the
highest, 17.3 - 12.7 = 4.6 tons, may be
compared to the range expected in samples of
8. This may be approximated from Egon
S. Pearson's table A in Biometrika, 24 (1932),
page 416. There it is shown that the range
$4.29\sigma$ will be exceeded in 5 percent of
random samples of 8 drawn from a normal
population with standard deviation $\sigma$. Using
the sample estimate of $\sigma$, $\sqrt{7.89/8}$ = 0.99 ton,
the 5 percent range is (4.29)(0.99) = 4.25
tons. Thus, the sample range slightly exceeds
the 5 percent point.


Summarizing: the statistical evidence is
that under the hypotheses tested neither F
nor t is as unusual as 1 in 20 but that the
range exceeds the 5 percent point. As for
interpretation, that must rest on biological
considerations. Is there good reason for
thinking that treatment 2 is outstanding
despite the fact that apparently you had
not suspected it? Is this in accord with
other evidence? This treatment may be a
genuine “find” which would call for more
critical experimentation. On the other hand,
the unusually high yield may turn out to
be an experimental freak.

George W. Snedecor,
Iowa State College.
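
The arithmetic of the two preceding answers can be retraced with a short sketch. Only figures printed in the answers are used; the function and variable names are introduced here for illustration, and the labels for the cage and treatment components are an assumed notation. No probabilities are computed, since those come from the tables cited in the answers.

```python
from itertools import combinations
from math import sqrt

# --- Laying-hen query: sums of squares and degrees of freedom as printed ---
anova = {
    "treatments": (5372.3, 5),
    "cages": (3455.6, 6),
    "hens": (32164.7, 108),
}
mean_square = {source: ss / df for source, (ss, df) in anova.items()}

# Treatments are tested against cages treated alike, cages against hens.
f_treatments = mean_square["treatments"] / mean_square["cages"]
f_cages = mean_square["cages"] / mean_square["hens"]


def variance_components(ms_treat, ms_cage, ms_hen, birds_per_cage, hens_per_treatment):
    """Estimate the three components from the expected mean squares."""
    within_cage = ms_hen
    between_cages = (ms_cage - ms_hen) / birds_per_cage
    between_treatments = (ms_treat - ms_cage) / hens_per_treatment
    return within_cage, between_cages, between_treatments


components = variance_components(
    mean_square["treatments"], mean_square["cages"], mean_square["hens"],
    birds_per_cage=10, hens_per_treatment=20,
)

# --- Eight-treatment query: figures as printed in the answer ---
means = [14.0, 17.3, 14.4, 13.3, 14.0, 12.7, 13.9, 13.6]
error_mean_square = 7.89
f_eight = 15.39 / error_mean_square

# Fisher's suggestion: 1 in 20 is replaced by 1 in (number of pairs x 20).
n_pairs = len(list(combinations(means, 2)))
required_odds = n_pairs * 20

sample_range = max(means) - min(means)
t_value = sample_range / 1.404            # standard error of the difference

# Range test: 4.29 standard deviations is the 5 percent range for samples of 8.
se_of_mean = sqrt(error_mean_square / 8)
range_5_percent = 4.29 * se_of_mean

print(round(f_treatments, 2), round(f_cages, 2), [round(c, 1) for c in components])
print(round(f_eight, 2), n_pairs, required_odds, round(t_value, 2))
print(round(sample_range, 2), round(range_5_percent, 2))
```

Carrying the standard error of a mean to more places than the rounded 0.99 gives a 5 percent range of about 4.26 rather than 4.25 tons; the sample range of 4.6 tons exceeds it either way.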


QUERY In comparing the reactions of soil 
samples should pH or the concentration of 


the hydrogen ion be used in testing the signifi- 
cance of the difference between treatments? 


ANSWER So far as statistical convenience is 
concerned, choose the variate that most nearly 
conforms to the mathematical model of the 
experiment. For example, if two or more 
groups of samples are compared, it is pleasant 
to have the variate distributed normally with 
equal variances in the groups. If regression 
is involved, the easiest relation to handle is 
the linear with equal variances for all values 
of x. For randomized blocks, one would like 
the deviations of the variate in the several 
plots to be randomly assorted and normally 
distributed. 

In my experience, the pH scale has most 
often met both the biological and statistical 
specifications. 
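
That the two scales can lead to different summaries is easily seen in a small sketch. The three pH readings are hypothetical, the relation pH = -log10 of the hydrogen-ion concentration is the usual definition, and the function names are introduced here for illustration.

```python
from math import log10
from statistics import mean


def concentration_from_ph(ph):
    """Hydrogen-ion concentration corresponding to a pH value."""
    return 10.0 ** (-ph)


def ph_from_concentration(concentration):
    """pH corresponding to a hydrogen-ion concentration."""
    return -log10(concentration)


# Hypothetical pH readings from three soil samples under one treatment.
ph_readings = [5.0, 6.0, 7.0]

# Averaging on the pH scale.
mean_ph = mean(ph_readings)

# Averaging on the concentration scale, then converting back to pH.
mean_concentration = mean(concentration_from_ph(ph) for ph in ph_readings)
ph_of_mean_concentration = ph_from_concentration(mean_concentration)

print(mean_ph, round(ph_of_mean_concentration, 2))
```

On the concentration scale the most acid sample dominates the mean, so the two averages disagree; the choice of variate therefore affects both the means compared and the variances within groups.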


George W. Snedecor,
Iowa State College.

LUSH, Jay L. (Iowa State College). Chance as
a Cause of Changes in Gene Frequency Within
Pure Breeds of Livestock.


The N individuals reaching breeding age 
in each generation are a sample of two N 
gametes from the preceding generation. These 
N individuals in turn are the universe from 
which another sample of two N gametes (if 
the population is constant in size) are taken 
to constitute the next generation. The procession
of the generations is statistically the
sampling of a sample from a sample, from a
sample, etc. If the probability of becoming a
parent of an animal to reach breeding age in
the next generation were uniform for all those
which reach breeding age in the parental generation,
the variance of gene frequency (q) due
to chance would be $q(1-q)/2N$ in one generation
and t times as much in t generations, except as
$q(1-q)$ becomes damped down when q approaches
zero or 1.0.


Actually the probability of achieving
parenthood varies widely among those reaching
breeding age in the parental generation. This
makes the chance changes in gene frequency
much larger than this formula indicates.


Among the important causes for this wide 
variation in number of descendants are: (1) 
sterility of some members of the parental gen- 
eration, (2) correlation between the fates 
of relatives, (3) many are sold into grade 
herds where they have no chance to contribute 
to the future pure breed, (4) functional 
stratification of purebred herds into a few herds 
(“centers of radiation” they would be called in 
discussions of evolution) which sell mainly 
to other purebred herds which have for their 
main business producing sires for use in 
commercial herds, (5) fame of a few sires 
and dams to such an extent that nearly all 
their close relatives are eagerly sought for 
use in other purebred herds, while the relatives
and descendants of the many sires and
dams which do not achieve this fame go largely 
into commercial herds or into the purebred 
herds which sell to the commercial herds. 

Several examples of how much gene 
frequency may have varied by chance because 
of the extensive use of one or a few breed- 
ing animals are cited. A summary of the 
studies of inbreeding in 17 pure breeds of 
livestock indicates an average increase of about 
0.4 to 0.6 percent per generation in the inbreeding
coefficient. This is equivalent to values of 80
to 130 for N in the preceding formula and
would give the chance $\sigma_q$ in one generation a
value of 0.03 to 0.04 when q is near 0.5.
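
The figures in this paragraph can be checked with a short sketch. It uses the variance $q(1-q)/2N$ given earlier in the abstract and assumes, as the abstract implies but does not state, that the increase in the inbreeding coefficient per generation is about $1/2N$; the function names are introduced here for illustration.

```python
from math import sqrt


def chance_sigma_q(q, n):
    """Standard deviation of gene frequency from one generation of sampling."""
    return sqrt(q * (1.0 - q) / (2.0 * n))


def n_from_inbreeding_rate(delta_f):
    """Equivalent N, assuming inbreeding rises by 1/(2N) per generation."""
    return 1.0 / (2.0 * delta_f)


# Inbreeding increases of 0.4 and 0.6 percent per generation.
for delta_f in (0.004, 0.006):
    print(delta_f, round(n_from_inbreeding_rate(delta_f)))

# Chance standard deviation of q near 0.5 for the two rounded values of N.
for n in (80, 130):
    print(n, round(chance_sigma_q(0.5, n), 3))
```

The rates of 0.6 and 0.4 percent give N of roughly 83 and 125, in line with the rounded 80 to 130, and the standard deviation at q = 0.5 falls between about 0.03 and 0.04.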
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(Continued f-om page 15) 
tained at little extra cost. The $\chi^2$ test (8) is
the usual method for determining the agreement
between observed ratios and those expected
on the basis of some genetic hypothesis.
Possible linkage between the factors for two
character pairs may be tested by calculating
$\chi^2$ for independence. If the two ratios appear
not to be independent, the next step is to measure
the extent of the linkage by computing the
percentage of recombination. The product
method (8, 13) is usually preferred for F2
data, particularly if suitable tables are available
(13). If data have been recorded on the
F3 as well, the method of maximum likelihood
(8, 15, 16) is used to estimate the recombination
percentage which best satisfies all available
data.
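
The two uses of $\chi^2$ mentioned here may be sketched as follows. The counts are hypothetical, the 9:3:3:1 ratio is chosen only as a familiar example of a genetic hypothesis, and the test of independence is written in the ordinary fourfold-table form, which is an assumption and may differ in detail from the procedure of the references cited. The function names are introduced here for illustration.

```python
def chi_square_goodness_of_fit(observed, ratio):
    """Chi-square for agreement between observed counts and an expected ratio."""
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_independence(a, b, c, d):
    """Chi-square (1 degree of freedom) for independence in a fourfold table.

    a, b, c, d are the counts of the classes AB, Ab, aB, ab.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical F2 counts agreeing exactly with 9:3:3:1 (no linkage).
unlinked = [90, 30, 30, 10]
# Hypothetical F2 counts with an excess of the two parental classes.
linked = [120, 15, 15, 10]

print(chi_square_goodness_of_fit(unlinked, [9, 3, 3, 1]))
print(chi_square_independence(*unlinked))
print(round(chi_square_independence(*linked), 2))
```

A value above 3.84, the 5 percent point for one degree of freedom, would suggest that the two character pairs are not independent, after which the recombination percentage would be estimated by the methods named in the text.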
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